A Feature Extraction Algorithm of Brain Network of Motor Imagination Based on a Directed Transfer Function

Aiming at the feature extraction of left- and right-hand movement imagination EEG signals, this paper proposes a multichannel correlation analysis method and employs the Directed Transfer Function (DTF) to identify the connectivity between different channels of EEG signals, construct a brain network, and extract the characteristics of the network information flow. Since the network information flow identified by DTF can also reflect indirect connectivity of the EEG signal networks, the newly extracted DTF features are incorporated into the traditional AR model parameter features and extend the scope of feature sets. Classifications are carried out through the Support Vector Machine (SVM). The classification results show the enlarged feature set can significantly improve the classification accuracy of the left- and right-hand motor imagery EEG signals compared to the traditional AR feature set. Finally, the EEG signals of 2 channels, 10 channels, and 32 channels were selected for comparing their different effects of classifications. The classification results showed that the multichannel analysis method was more effective. Compared with the parameter features of the traditional AR model, the network information flow features extracted by the DTF method also achieve a higher classification effect, which verifies the effectiveness of the multichannel correlation analysis method.


Introduction
Brain-computer interface (BCI) technology can realize direct information interactions between brains and the outside world. Among the input signals of BCI systems, the motor-imaging EEG signals have been widely used in the BCIs as spontaneous EEG signals. As well known, the motor imaging BCI has an important application value in the medical field. For example, it can be used for rehabilitation training of patients with motor dysfunction [1] or for controlling wheelchairs and robotic arms through motor imaging [2,3]. e function can bring a great convenience to the patients' lives. In addition, BCI technology also has a wide range of applications in human's daily lives: robot control, smart home, computer games, and so on [4][5][6].
When performing motor imagination, it usually causes event-related desynchronization (ERD) and event-related synchronization (ERS) phenomena in the brain [7]. us, different features can be extracted according to the different EEG signals generated by different motor imaginations so that the features can be further used to identify the types of EEG signals of different motor imaginations. Commonly used feature extraction algorithms include co-space mode, wavelet transform, autoregressive model, power spectrum estimation, and other algorithms [8]. ese algorithms generally extract features from a single channel of EEG signal and less consider the correlation between different channels. At present, more and more researches use the method of correlation analysis to identify the information flow between different channel signals in the motor imagery EEG signal to construct a brain network and extract network features. e neural connectivity of the brain networks is usually investigated in three aspects: structural brain network, functional brain network, and causal brain network [9]. Structural brain networks are the physiological connections between different neurons, which represent the physiological synaptic structures of the brain. Functional brain networks represent the statistical-based associations of biological neural networks and do not consider the flow of information and causal connections between nodes. Causal brain networks reflect the information transmission and interaction between different areas of the brain and is a directed network. In an effective brain network, effective connectivity shows the direction and intensity of information flow between different brain regions [10].
Some studies have used the connectivity between different areas of the brain to analyse motor imagery EEG signal. Daly et al. used modal decomposition to analyse the phase synchronization between the Intrinsic Mode Functions (IMF) of all channels of EEG data and built the brain's information connection networks [11]. e clustering coefficient is used as the feature of the connectivity between different channels to classify the motor imagery EEG signals, and good experimental results are obtained. Billinger et al. calculated the connectivity between various brain regions through the direction transfer function and used this as a feature vector to determine the category a motor imaging EEG signal belongs to. It is proved that the motor imagery EEG signal can be classified by the direction transfer function [12]. Li et al. proposed to use the partial directional coherence (PDC) algorithm to model the motor imagery causal effect networks and used the network parameters as features to realize the classification of the left-and righthand motor imagery EEG signals [13]. Luo et al. proposed a feature extraction method combining brain function network and sample entropy [14]. In the method, the brain function network is constructed separately for the left and right hemispheres of the brain. e node degree and average clustering coefficient of the network are used as the characteristics of the brain function network. e new features are combined with the characteristics of sample entropy; an improved classification effect was achieved. e aforementioned methods commonly focused on one aspect of the networks' properties and tried to find the unique informative feature for motor imagery EEG signal classifications. However, to the knowledge of updated neurophysiology, the information transmission often involves the various emerging properties coming from the complex brains' networks. e combinations of different network features were seldomly performed and investigated. is paper studies the use of the combination of AR parameters and Directed Transfer Function (DTF) algorithm to identify the connectivity and information flow between different channels and different brain regions in motor imagination. As known, the network information flow identified by the traditional method AR model can only represent the direct connectivity of the EEG signal network, which shows a lack of network structure features. Considering that the network information flow identified by DTF can also reflect indirect connectivity of the EEG signal networks, the newly extracted DTF features are incorporated into the traditional AR model parameter features and the scope of feature sets is extended. At the end, classifications are carried out through the Support Vector Machine (SVM). e classification results show the enlarged feature set can significantly improve the classification accuracy of the leftand right-hand motor imagery EEG signals. e DTF does not require a priori knowledge about network connectivity and is highly resistant to noise, which makes the method highly appropriate to use even in an environment with common noise. Additionally, for frequency analysis, the DTF is also resistible to constant phase disturbances and can be used even with an environment of attenuation of electrode signals and volume conduction effect, as known to be common in EEG signals.

AR Model.
e autoregressive (AR) model belongs to a stationary time series model. e basic idea is to use an autoregressive process to fit EEG signals so that fewer parameters can be used to reflect more direct connective information [15]. e AR model is an all-pole model. In the time series, the information at the next moment can be weighted by the information at the previous moment and the current information. e time series of EEG data can be best fitted by selecting the appropriate order, which can be expressed by the following difference equation [16]: (1) In the formula, ε(n) represents white noise with a mean value of zero and a variance of σ 2 , p is the order, and a k is the coefficient of the model. When calculating AR model parameters, Burg algorithm is a commonly used algorithm. e algorithm is an autoregressive power spectrum estimation method, which can be estimated and calculated with less data, and the estimation result is very close to the true value. However, when this algorithm is used to process high-order models and the data is long, the estimation precision becomes lower, and the phenomenon of spectral line shift and spectral split may occur [17].

Directed Transfer Function.
e DTF is used to determine the direction of information transfer between different variables in multivariable signals and show a good robustness. It can especially correctly identify the direction of information transfer in multivariable signals with a noisy. erefore, the method is often used to determine the direction of information flow between time series in order to identify the direction of information flow of EEG signals in different channels and different brain sections.
Let vector S(t) � [s 1 (t), s 2 (t), . . . , s N (t)] T denote Nchannel EEG signals at time t. Fit the data set S through a multivariate autoregressive model [18]: In (1), E(t) � [e 1 (t), e 2 (t), . . . , e N (t)] T represents a white noise vector with a zero-mean white noise, and Λ k is the N × N matrix of model coefficients. e notation p is the order of the model, which can be calculated by the AIC criterion. In order to study the nature of its frequency domain, the above formula can be transformed to the frequency domain [19]: Among them, f denotes a specific frequency and H (f ) is the transfer matrix of the system from a white noise to the network response. Suppose all the outputs of the networks come from the white noise, which are independent from each channel.
us, one element of H(f) represents the connections between the jth input and the ith output of the system. e transfer matrix is defined as where I is an identity matrix and Δt is the time interval between two samples. A normalized form of H(f), named as DTF, is often used to determine the relationship between two signals in all signals of a network as [20] c where c ij (f) describes the ratio between the inflow from signal j to signal i and all inflows to signal i. Naturally, the value of DTF elements is in the interval of [0,1]. If c ij (f) � 1, it means that all of the information in signal i is composed of information from signal j. If c ij (f) � 0, it means that there is no information flowing into signal i in signal j.

Support Vector
Machine. Support vector machines (SVM) have a wide range of applications in pattern recognition. erefore, SVM can be applied to the situation where the input feature vector is both linearly separable and nonlinearly separable. When the problem of pattern recognition is linearly separable, the main idea is to find a classification hyperplane to maximize the feature distribution spacing on both sides of the hyperplane. In the nonlinear separable case, the basic theory is to map the feature vector to a high-dimensional space through nonlinear transformation. en, the classification hyperplane is obtained so that the input feature vector becomes linearly separable.
us, this hyperplane maximizes the distance between the two types of feature distributions [21].
Set the sample set as When the samples are linearly separable, the positive and negative sample sets will be separated. e constraint conditions for the classification of positive and negative sample sets are expressed by the following formula [22]: where w is the normal vector of the classification surface and b is the classification threshold.
For linearly separable training samples, the classification function is In (7), sgn denotes the symbolic function. For nonlinear problems, it is necessary to map it to a high-dimensional linearly separable space through an appropriate kernel function. Transform the data from the original space to a high-dimensional feature space. en, the linear classification after the nonlinear transformation can be accomplished. At this time, the corresponding discriminant function becomes [21] f where l is the number of support vectors, a i is a Lagrangian multiplier, and K(x i · x) is a kernel function. Choosing different kernel functions has an important influence on the classification effect of SVMs. rough experimental tests, SVMs with linear kernel functions have achieved very good results in pattern classification of motor imagery EEG signals. In the following calculations, the average classification accuracy rate is obtained through the tenfold cross-validation and multiple calculations.

Data Collection.
e acquisition of motor imaging EEG signals uses a 32-channel EEG equipment from BP Inc. in Germany. e 32 channels are all electrodes of the 32-channel EEG cap distributed according to the International 10-20 System Standards. Before the experiment, a conductive paste was injected into the electrodes of an EEG cap so that the electrode impedance was below 5 kilohms. en, the sampling frequency was adjusted to 500 Hz. e experiment process is carried out in a quiet environment. e signal acquisition timing diagram is shown in Figure 1. e display interface is blank within 0 to 2 seconds. During that period, a subject relaxes and does not engage in thinking activities. Later within 2-3 seconds, a prompt symbol appears on the display interface to remind the subject to start imagining the movement of the left or right hand. In the future 3 to 9 seconds, an arrow appears on the display interface with a left or right direction. At this time, the participant imagines the movement of the left or right hand from the difference in the direction of the arrow. e experiment was divided into 4 groups, and each group performed 20 times of motor imagery. ere are 7 subjects participating in the experiment, all around 25 years old and in good health. Eighty sets of data were collected for each person.

Results and Discussion
In the collected 32-channel EEG signals, 10 channels of EEG data of F3, F4, C3, C4, Fz, Cz, FC1, FC2, FC5, and FC6, which are in or near to the motor brain area, are selected for processing. e DTF is used to identify the connectivity between different channels. e frequency components of 10 Hz, 15 Hz, 20 Hz, 25 Hz, and 30 Hz are calculated for the motor imagery EEG signals. Initially, we have performed the Computational Intelligence and Neuroscience Butterworth filtering of the primitive EEG data and then used the filtered data for the further feature extractions. However, for most data, the classification correct ratio went down 5% than that of unfiltered data does. Since the filtering effect of EEG data is not satisfying, the paper directly used the unfiltered data for the further feature extraction and task classifications. e DTF values between each electrode data is used as features, and then the aforementioned SVM is used for classifications. en, the training sets (50% of the total data) and test sets (50% of the total data) were randomly selected and tenfold cross-validations were performed. e average classification accuracy rate is shown in Table 1.
As can be seen from the classification results, there are certain differences in the classification results of brain network characteristics of EEG signals with different frequencies in all subjects' motor imagination EEG signals. For the EEG signals of different subjects, the classification accuracies also show many individual differences. Subject 4 shows the highest classification accuracies, while subject 6 shows rather low classification accuracies for each frequency. Most subjects' motor imagination EEG signals reach the highest classification accuracy at 25 Hz and 30 Hz frequencies. erefore, the DTF network features of the EEG signals calculated for 30 Hz were fused with the parameter features of AR model. e average classification accuracy both before and after feature fusion were shown in Table 2: It can be seen from the classification results that for almost all subjects (except subject 1), the recognition results have been improved significantly after DTF feature fusion. For subject 1, the recognition accuracy of the fused features is lower than that of AR even if it is still higher than that of DTF.
Furtherly, all the 32-channel EEG signals were used to identify the connectivity between each channel. en, the DTF values were also involved as features to classify the different tasks. e average classification accuracy rate is shown in Table 3.
It is easy to be found that the classification accuracies have been greatly improved with the involvement of more channel signals. It also shows that in the processes of motor imagination, there emerges more connectivity of information flow in a larger area of the brains. e longer distant connectivity may contribute significantly to classifying the different tasks.
For the 32-channel EEG signals, the DTF network features of the EEG signals calculated for 30 Hz were also fused with the parameter features of AR model. e average classification accuracies for the three methods were shown in Table 4.
Comparing Table 4 with 2, we can see the classification accuracies have been significantly increased from 10channel signals to 32-channel signals. It can also be found that the fused feature extraction method achieved higher accuracies than AR or DTF did in most cases, except for  e experimental sequence diagram. e first two seconds are in an idle state. As a reminder for starting motor imagery of the left or right hand, a prompt symbol appears on the screen during the 2nd to the 3rd second. After that, an arrow to the left or right is displayed on the screen in the 3rd to the 9th second.   Computational Intelligence and Neuroscience some few cases of subjects 2, 5, and 6. For example, the classification accuracies of AR + DTF are lower than those of AR for subjects 2 and 5, while the classification accuracies of AR + DTF are lower than those of DTF only for subjects 6. e exception cases can be explained by the individual performances of different subjects.
It can be seen from the experimental results that for the 10channel EEG signal and the 32-channel EEG signal, the classification accuracy after the fusion of the single-channel feature and the network feature has been improved. e combination of direct network information flow and indirect network information flow is used when extracting network features. Consider not only the interaction information between neighbouring electrodes, but also the interaction information between electrodes that are farther apart. e singlechannel features extracted by traditional feature extraction algorithms are combined with the network features constructed in this paper. e fused features improve the classification accuracy of motor imagery EEG signals, especially in the case of data analysis with few channels, the recognition accuracy of the fusion method is significantly improved.
In order to verify the influence of the number of channels selected for EEG signals on the processing results, the EEG signals of 7 subjects were processed using the main C3 and C4 channels of the left and right hemispheres of the brain, as well as the abovementioned 10-channel and 32-channel EEG signals. e features are extracted through the AR model, and the average classification accuracy is shown in Figure 2. e characteristics of network information flow are extracted through DTF, and the average classification accuracy is shown in Figure 3.
As can be seen from Figures 2 and3, for the features extracted by the AR model and the features extracted by the DTF algorithm, as the number of selected channels increases, the classification accuracy rate gradually increases. It shows that when performing left-and right-hand movement imagination, the brain areas exhibiting significant EEG signal correlations outside of the areas of C3 and C4 electrodes, that is, the EEG signals of other channels, also have shown more wider information flows between traditionally observed brain areas and other neighbouring channels. As the number of selected channels increases, the classification accuracy of EEG signals can be improved. When the number of channels is large, the features extracted by analysing the brain network information flow method also achieve better results compared with the traditional methods, and the classification accuracy of the motor imagery EEG signal is improved after the feature fusion.

Conclusions
In the paper, the features of motor imagery EEG signals are extracted from the connective characteristics of brain     e average classification accuracies with the DTF features from the 7 subjects. e EEG signals of 2 channels, 10 channels, and 32 channels are selected for processing; then, DTF is used to extract network information flow characteristics. After that, the SVM is used for classification, and the average classification accuracies of each group of data are calculated here.
networks. e network information flows between different channels and different brain regions are identified and the extracted brain network features are used for further task classification. e network features are then combined with the traditional AR model parameter features. e new feature combination improves the classification accuracies for the left-and right-hand motor imagery EEG signals. In addition, with more channel signals involved, the classification accuracies of motor imagery EEG signals have also been improved significantly. e result shows that there is a wide range of network information flows between different channels and different regions in the brains. ese network information features can contribute to the classification accuracies of motor imagery EEG signals.
Accurate identification of network structure requires a large amount of multichannel neuronal response data with a high temporal and spatial resolution. As known, the data with electroencephalography (EEG) usually has a satisfied temporal resolution but has a low spatial resolution. erefore, the number of the commands represented by motor imagery EEG signals is limited in view of the low spatial resolution of EEG signals. To overcome the drawback, some other existing measurement methods, such as functional Magnetic Resonance Imaging (fMRI) and Invasive Electrode Implantation (IEI), which have higher spatial resolutions may be the promising methods to merge the gap of limited command representation. Additionally, the other essential properties of the biological neurons are nonlinear dynamic and neuronal plasticity. Currently, there are few effective network structure reverse identification methods, which can accurately model and adapt this nonlinear and time-variant dynamic relationship. Further, a nonlinear time-variant method is required for extracting the precious network information flow features in neural networks. With the development of complex network researches, some topological characteristics of complex networks, for example, node degree and its distribution characteristics, degree correlation, agglomeration degree and its distribution characteristics, shortest distance and its distribution characteristics, node betweenness and its distribution characteristics, have already become the intriguing potential features for improving the classification accuracies for motor imagery EEG signals. In addition, the data collected in this manuscript may be subjected to some degree of noises and disturbance. A fuzzy preprocessing method [23] can be used to improve the robustness of the proposed fusion feature extraction method for analysing motor imagery EEG signals. e excellent performance of the fuzzy similarity approach proposed in [23] has been verified and confirmed for assessing the mechanical integrity of steel plates [24].

Data Availability
e EEG data of the 7 subjects used to support the findings of this study are available from the corresponding author upon request. e contact information is dongchaoyi@ hotmail.com.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.